Counting young birds: A simple tool for the determination of avian population parameters

Population parameters are usually determined from mark-recapture experiments requiring laborious field work. Here, we present a model-based approach that can be applied for the determination of avian population parameters such as average individual life expectancy, average age in the population, and generation length from age-differentiated bird counts. Moreover, the method presented can also create age-specific results from lifetime averages using a deterministic exponential function for the calculation of parameters of interest such as age-dependent mortality and age distribution in the population. The major prerequisites for application of this method are that young and adult birds are easily distinguishable in the field as well as the existence of sufficiently large data sets for error minimization. Large data sets are nowadays often available through the existence of so-called “citizen science” databases. Examples for the determination of population parameters are given for long-living migratory birds which travel as families in large groups such as the Common Crane and the Whooper Swan. Other examples include long-living partially migratory birds staying together in large flocks which do not travel as families such as the Black-headed Gull, and also short-living songbirds where at least from one sex young and adult birds are easily differentiable such as the male Black Redstart.


Introduction
This Supporting Information to the manuscript "Counting young birds……" describes the mathematical background for the analyses provided in the manuscript. It also explains how to apply these computations to other or new data using Excel or R.

General notations and definitions
In the following, the term population refers to birds of a specific species in a larger region during a certain period of time. This can encompass migratory birds encountered in their stopover places (e.g. Common Cranes from Scandinavia resting in the region of the Darß-Zingster Boddenkette and Rügen) as well as a local breeding population (e.g. local Black Redstarts). The most important parameter is the proportion "g" of young birds, already introduced in the main text. This implies that the average life expectancy "L" of a bird in a stationary population is given by The situation is more complex in non-stationary populations. Also, other parameters such as the mean age "A" cannot be computed easily in non-stationary populations due to the dependency on individual mortality.

Population / Number of birds "N"
The population is considered year-wise. The start point of each counting year is usually not the first of January but the month when the counting starts, e.g. during migration when young and adult birds can still be well distinguished. In the case of Common Cranes, this would be October.
The number of birds in a population in a specific year "j" (e.g. "j=2018") is denoted by "Nj". Then is the growth of the population from year "j-1" to the following year "j". A negative growth means that the population is shrinking.

Population increase
Given the proportion "gj" of young birds in year "j", the population increases by Increased mortality of young birds is not considered, when counting occurs after fledging or in a later period of time. Directly after hatching the portion of young birds in a population will be larger resulting in a lower average life expectancy in this population because of the enhanced mortality of fresh hatchlings or very young birds.

Population decrease: Mortality "M"
Mortality leads to a decrease of the population. In contrast to the proportion of young birds, mortality cannot be observed directly, but must be estimated from the counts of young and adult birds. Define the mortality as Mi,j = probability that a young bird born in year "i" dies in year j>i F4 Because the counting is carried out only once -or during a defined period -per year, the death of birds dying between the counts of year "i" and year "i+1" must be assigned to either year "i" or "i+1". Here we assign it to year "i+1", implying that "Mi,i = 0" holds. This can lead to a slight overestimation of the life expectancy.
Assuming that the mortality depends only on the age of the bird, then the probability that a bird dies at age "a" is denoted by As explained above "M(0) = 0".

Life expectancy "L"
Given the mortalities "M(a)" for each year, these probabilities must sum up to 1 and the average life expectancy "L" can be computed from the probabilities.
The residual life expectancy "Le(a)" of a bird that has reached already age "a" is given by the conditional expectation.
Where "k" corresponds to the additional years a bird of age "a" still survives. The cumulative mortality "cM", i.e. to die before age "a+1" or the probability that a bird lives for at most "a" years, is The survival probability "S(a)" is then 2.2.6. Yearly or relative (conditional) mortality "rM(a)" The yearly or relative (conditional) mortality "rM(a)" is the probability that a bird dies at age "a+1" given it survived until age "a".
2.2.7. Population growth and growth rate "r" The population growth, defined in Eq. F2, which can also be negative if more birds die than are born, can be computed as the number of young birds given in Eq. F3 minus the number of old birds that die in the corresponding year.
In order to avoid the absolute population numbers that are difficult to determine we consider relative quantities.
From Eq. F13, the yearly growth "r" rate can be inferred simply by dividing "Nj-Nj-1" by the total number of birds of the population "Nj-1" in year "j-1".
Changes of the population caused by migration can be neglected if out-and in-migration are balanced or if the observational area is sufficiently large.

Basic equation for population dynamics 2.3.1. Simplifying assumptions
Given the proportions "gi" of young birds and the yearly mortalities "Mi,j" are known and initial values for the past years are available, Eq. F13 provides a means to compute the number "Nj" of birds in the population for the succeeding years. Usually these parameters are unknown. But with the following simplifying assumptions one can still derive estimates for the number "Nj".
-The proportion "g" of young birds remains constant over a longer period or can be replaced by its mean. -The growth rate "r" remains constant over a longer period or can be replaced by its mean.
-Mortality "Mi,j" depends only on the age.
With these assumptions, Eq. F14 implies the basic equation for population growth (see also [1 ]). Given the growth rate "r", the population increases after "x" years by the factor "(1+r) x ". In order to calculate the time that it takes until the population has doubled, "(1+r) x =2" must be solved, leading to the doubling time "D".

Reformulations of the basic equation for population growth
Solving Eq. F15 for the proportion "g" of young birds leads to and with Eq. F11 we obtain The average life expectancy "L" is then with Eq. F7 For the special case "r=0" Eqs. F18 and F19 imply F20 In a stationary population with "r=0" the average life expectancy "L" is the reciprocal of the proportion of young birds in the population "g" independent of all other parameters. In a shrinking or growing population "L" does not only depend on "g" but also on the mortalities. If "r" is small, Eq. F20 also holds approximately.

Age parameters in a population
2.4.1. Age distribution "P(a)" The age distribution can be derived from the following thoughts concerning the population in year "j" where "N(j)" denotes the total number of birds in the population in year "j".
The factor "S(a)" takes into account that the corresponding birds have to survive at least until age "a".
According to Eq. F14 we have for a constant growth rate "r"

F21
The proportion "P(a)" of age "a" years is then which gives the age distribution "P(a)" under the assumption of a constant growth rate "r".
We obviously have that these probabilities are normalised, i.e.

Mean age "A"
The mean age in the population is then There is no unified definition of relative mortality in the population "rP(a)". It refers to the relative change of the number of birds for the specific age groups.

F25
In a stationary population "r=0", "rP(a)" is equal to the mortality "rM(a)". However, for a non-stationary population, "rM(a)" referring to the probability for a bird of age "a" to die will not change, whereas "rP(a)" depends on the growth rate "r".

Generation length "G"
In population biology, the generation length "G" refers to the average time between two succeeding generations. It corresponds to the mean age of the breeding birds. Let "B1" denote the (mean) age when birds breed the first time and "B2" the (mean) age when birds breed the last time, then the generation length "G" is given by The numerator in Eq. F26 corresponds to the mean age of the breeding birds. The normalisation by the denominator is necessary because the breeding birds do not represent the full population.
(If the breeding success rates σ(a) for single years a are known, then P(a) in Eq. F26 can be replaced by "σ(a)*P(a)". One could even drop the boundaries B1 and B2 by setting "σ(a)=0" for birds that have not reached the breeding age.)

Types of mortality distributions
So far, it was assumed that the mortality "M(a)" is known. In other publications (e.g. [1]), different mortality distributions were considered.
The simplest distribution assumes that all birds reach the same age "L", meaning that "M(L)=1" and "M(a)=0" for "a≠L" is a degenerative distribution. Although this distribution simplifies the computations significantly -most of the sums have just one non-zero term -it is quite unrealistic.
A more realistic assumption is that the causes of death for wildlife birds are e.g. predation, shooting, extreme weather, accidents and infections. It is assumed that causes usually strike the birds independent of their age, so that each year a certain proportion of birds dies from all age groups. This leads to a geometric distribution where the mortality has an exponential form "M(a)~x a (x<1)". In this case, the yearly mortality is independent of the age.
The parameter "x" can either be derived from Eq. F15 or as a function of the proportion of young birds "g" and the growth factor "r" in Eq. F18.
The problem of this mortality model is that the maximum age is unlimited and permits arbitrary old birds although with a very low probability. Therefore, we will also consider a truncated geometric distribution.

Degenerative mortality
All birds are assumed to die at age "a=L". According to Eqs. F5, F10 and F22, mortality "M(a)", survival probability "S(a)" and the proportion "P(a)" of birds of age "a" in the population have the following values.  Based on Eq. F15, the growth rate is then "r=g*(1+r)-g*(1+r) 1-L "and solving for "g" and "L", respectively, one obtains

F28
The equation on the left-hand side computes the proportion of young birds depending on the growth rate "r" and the age "a" at which all birds die, in this case the maximum age "L". The equation on the right-hand side derives the (maximum) age "L" from the observed proportion "g" of young birds and the (constant) growth rate "r".
For a stationary population "r=0", these equations lead to a division by zero. But when we take the limit "r→0", we obtain the already mentioned relations "g=1/L" and "L=1/g", respectively.
Formally speaking, "L" is assumed to be a positive integer number. Eq. F28 will usually lead to a non-integer number for "L(g,r)". This can nevertheless be used in the sense of an approximation or interpolation.

Geometric distribution (Expo)
When mortality follows a geometric distribution M(a)=c*x a for a>0 F29 the parameters "c" and "x<1" must still be determined. Writing a geometric distribution as in Eq. Inserting "S(a)=x a " in Eq. F18, yields for the proportion of young birds "g" From the observed proportion of young birds "g" and the growth rate "r" one can solve this equation for the parameter "x" of the geometric distribution, yielding x = (1-g)*(1+r) and L = 1/(1-x) = 1/(g-r+r*g) F32 Again, for a stationary population with "r=0" we obtain the known relation "L=1/g".
The following box summarizes the formulae for the parameters of interest in this model depending on the observed proportion of young birds "g" and the (constant) growth rate "r".
The last breeding age "B2" is assumed to be unlimited here.

Application example 1: Average life expectancy "L"
We consider a population with a proportion "g=0.2=20%" of young birds. Eq. The influence of the growth rate "r" is stronger for a geometric mortality distribution in comparison to a degenerated distribution.

Application example 2:
Mortality "M(a)", survival probability "S(a)", and age distribution "P(a)" For "g=0.2" and "r=0" the survival probability for age "a=25" years is at the negligible level of 0.4%. The average age "A" in the population is 4 years and the generation length "G" is 7 years assuming that the first breeding age "B1" is at 3 years. The mortality "M(a)", survival probability "S(a)", and age distribution "P(a)" are shown in Fig. 1.

Fig. 1:
Mortality "M(a)", survival probability "S(a)", and age distribution "P(a)" depending on the age "a" for a proportion of young birds "g=0.2" corresponding to an average life expectancy of "L=5" years (for "r=0").
All age values start at the counting time of the birds not at the hatching time. The average life expectancy "L" of hatching birds is smaller because hatchlings and very young birds have a higher mortality compared to older birds. This higher mortality between hatching and counting is not taken into account in all these computations.
In the main manuscript we will also consider an example where we take the higher mortality of very young birds into account (Black Redstarts as example).

Application example 3: Varying proportion of young birds
All computations in Eq. F15 assume that both the proportion of young birds "g" and the growth rate "r" are constant over a longer period of time or can be replaced by their mean values. The following example illustrates what happens when the proportion of young birds "g" starts to vary from a certain year on. W e assume that "g" is constant at the level 0.2 until year 0 and that mortality follows a geometric distribution with an average life expectancy of L=5 years. Starting at year 1, "g" starts to vary randomly between 0.1 and 0.3 around the mean value of 0.2. According to Eq. F13, the population change can be computed iteratively after year "j=0" where "Mi,j = M(i-j) = M(a)" as in Eq. F33. The initial population size in year "j=0" is "N0=100". The population size is computed for the years "j=0,…,10". The result depends on the specific random numbers. Fig. 2 shows an example with quite extreme fluctuations of "g".

Fig. 2
Yearly proportion "gj"of young birds (randomly generated between 0.1 and 0.3 with mean "g0=0.2"), estimated population sizes "Nj" and growth rates "rj"changing over the years "j" for a specific configuration of random values "gj" with "gm=0.223" as the sample mean of the proportion of young birds, "Nlin(j)" the linearised population size and "rm=0.032" the sample mean of "rj".
If such fluctuations of "g" occur in reality, one should interpret the model with care. Nevertheless, even here the means "gm=0.223" and "rm=0.032" are quite close to the "true" value 0.2 and 0, respectively. And the average life expectancy "L" is estimated almost correctly with "Lm=1/(gm-rm+gm*rm)=5.047" with an error of 0.047 years which is negligible in comparison to other uncertainties that will lead to errors in the range of months.
Further investigations in this direction have shown that the model is quite robust against fluctuations of "g". Thus, variations of "g" can therefore be replaced by the mean value "gm" without losing much precision. However, systematic changes -e.g. caused by climate change -require further considerations.

Modified geometric mortality distribution (ModExpo)
The geometric distribution brings two obvious disadvantages. a) For the first year of life (starting from the counting time) there is no increased mortality considered. b) The theoretical maximum age is infinite although unrealistic ages will have a very low probability. Therefore, we modify the geometric distribution and introduce two additional parameters "M1" and "amax". The variables that we need to know for our model are now: • "g": proportion of young birds.
• "amax": The maximum possible (or maximum) known age for the birds.
In the range from "a=2" to "a=amax" the exponential law "M(a)=c*x a "is still valid, i.e. that the individual mortality is the same for all age groups except for the first year.

Numerical procedure
This more realistic model does not permit closed form solutions anymore as they were provided for the simpler case in Eq. F33. Therefore, numerical solutions are required. These numerical solutions can for instance be computed using the Excel function "Solver". This module is freely available for Excel, but must be activated first (see the instructions in the Excel file Supporting Information S2).

Practical computation with EXCEL
The order of computations is the same as in the examples in Section 3.2.
For the mortality, the constant "c" (see Eq. F29) is first calculated from "M(a)=c*x a " with "M(0)=0" and "M(1)=M1" and "M(a)=0" for "a>amax". Based on the mortality "M(a)", the survival probability "S(a)" can be computed according to Eq. F10 both depending on the parameter "x". This parameter is then estimated based on Eq. F18 given the observed value of "g" and "r". Then the other relevant variables such as the average live expectancy "L", the age distribution "P(a)", the yearly mortality "M(a)" and others can be calculated.
An example for an Excel sheet is shown in Fig. 3.

Fig. 3:
Excel sheet "modExpo-M1-amax.xlsx" with input values "g", "r", "M1" and "amax"in blue bold-faced font (upper left) and the most important results in red bold-faced font on the right-hand side. The age groups from "a=4" to "a=16" are omitted due to limited space (for details see text).
The constant "c" is the normalisation factor guaranteeing that the sum of mortalities is 1. The value for "x" must be defined by the user so that Eq. F18 is satisfied. Otherwise there will be a mismatch between the computed "g-value (red)" and the given "g-value (blue)". The value for "x" can always be found by a simple trial and error procedure or -a better option -by using the Excel "Solver".
The results for the simulation are: • Survival probability "S(a)".
(The Excel program with short instructions is available as Supporting Information S2) The most important results (in red bold-faced font) are the average life expectancy "L" (excluding the time after hatching and before counting), the average age "A" in the population and the mean generation length "G". The simple model based on the unmodified geometric distribution based on "g=0.2" and "r=0.03" would yield "L=5.682" (instead of 5.622), "M1=0.176" (instead of 0.2), "A=4" (instead of 3.751) und "G=7" (instead of 6.458) according to Eq. F33. The differences are mainly caused by the increased mortality "M=0.2" in the modified geometric distribution (instead of "M=0.176") and partly by the truncation at the maximum age "amax=20". Taking the general uncertainties caused by data collection into account, these differences are more or less negligible.
Larger differences between the geometric distribution and its truncated version in Eq. F33 can only be expected when the survival probability after the assumed maximum age "amax" is still significantly larger than zero in the unmodified geometric distribution. In this case, the unmodified geometric distribution could lead to birds of an unrealistic old age with a non-negligible probability. In the above example, we would have a small probability of "S(20)=0.0208" for the unmodified geometric distribution, which has a negligible influence on the population parameters computed in Eq. 33. In such cases, the additional effort to utilize the modified geometric mortality distribution might be not worthwhile. For birds with a longer life expectancy (e.g. Common Cranes), larger differences are possible.

Background information bootstrapping
Bootstrapping is performed to test how reliable the mean value of the observed proportion of young birds is over a period of time. First, the mean value of the observed proportion of young birds "g" is calculated based on the input file with the counted birds per year (found as an example for the Black-headed gull in Supporting Information S4). The program ModExpo needs in addition to the value of "g", the values for the growth rate per year "r", the mortality in the first year "M1", the maximum (known) possible age depending on the species "amax", the first (mean) breeding age "B1" which are specified by the user. Then, the bootstrapping length "n"the number of bootstrap samples -is defined with n=10,000 and the confidence level "ci" with a value of ci=95 %. But the number of bootstraps and the confidence level of the confidence interval can be adjusted if necessary. For the probability within the bootstrapping, the number of young birds and the corresponding number of total (young and adult) birds in each year is determined. The bootstrapping leads to a simulation of the parameters for the average life expectancy "L", the mean age in a population "A", and the generation length "G". For each of the parameters the specific confidence interval is determined as the corresponding quantiles of the parameters computed in the bootstrap samples.
A single bootstrap sample is generated in the following way. For each year the specified number of juvenile and adult birds is taken as set from which a sample of the same size is drawn with replacement. This leads to a bootstrap sample similar to the original data table with random deviations. For each of the bootstrap samples the parameters for the average life expectancy "L", the mean age in a population "A", and the generation length "G" are calculated.
3.4.2. Instructions for using the "ModExpo" method in R This method is referred to Chapter 3.3 "Modified geometric mortality distribution (ModExpo)". It uses the formulas described in Chapter 2 "Background for the computations". The method is implemented in R and the input variables are as follows: • Observed proportion of young birds "g", • Yearly population growth rate "r", • Mortality in the first year, i.e. "M1=M(1)"; "M1=0 means "auto" • The maximum possible (or maximum) known age for the bird species "amax", • First (mean) breeding age, "B1" The open source statistics software R (see https://www.r-project.org/) must be installed on the computer. The R file provided (Supporting Information S3) can be opened with any text editor. At the top of the file, values for the above parameters are given (as example for the Black-headed Gull with "r=0", "M1=0", "amax=30", and "B1=3"). These parameters should be changed for other bird species! Once these parameters are adjusted, one can simply copy and paste the full content of the text file to the R console. It should be noted that the R-package "readxl" will be installed requiring internet connection. This package needs only to be installed the very first time the program is running.
R will then open a file reader window in which the desired Excel file with the bird counts can be selected. The file must be in the same format as the example file (Supporting Information S4, including the counts of young and adult birds with the example of the Black-headed Gull).
After the calculation, all results are displayed on the console in R. First, the above mentioned determined values of the input file are shown and then the input values specified by the user. The input values "x" and "c" are calculated and needed to have very close values of the entered or file-specific value "g" and the calculated value "g" by "ModExpo". The results of "ModExpo" are observed proportion of young birds "g" (should be very close the input value "g"), the average life expectancy "L", the mean age in a population "A" and the generation length "G". If applying bootstrapping, the 95%-confidence intervals are shown for the values "L", "A", and "G" based on the defined bootstrapping length. Additionally, some control values are shown where the values of "Sum of M(a)" and "Sum of P(a)" have to be very close to 1.

Other approaches and further perspectives
In earlier publications ( [1][2][3]) other models for population dynamics were investigated based on ideas from mortality tables where mixtures of exponential and normal distributions are considered. The general procedure was always the same. Once the mortalities M(a) with at least one free parameter were specified, the free parameter or parameters was or were determined based on Eqs. F15 or F18, given the proportion "g" of young birds and the growth rate "r". All other parameters of interest can then be calculated using Eqs. F15 -F26. The average life expectancies "L" were always between the extreme of the degenerated distribution in Section 3.1 and the (unmodified) geometric distribution in Section 3.2.
The drawback of these earlier approaches was mainly the higher computational effort.
Another possible approach could model mortality "M(a)" by a function that leads to a linearly decreasing survival probability "S(a)" instead of an exponential one. Fig. 4 shows an example with a proportion "g=0.2" of young birds, a stationary population "r=0" and a maximum age of "amax=9" years. The maximum age "amax" was determined in such a way that Eq. F18 yields the given value of "g".

Fig. 4:
Mortality "M(a)", survival probability "S(a)" and yearly mortality "rM(a)" depending on the age "a" (here from 0 to 10) for a proportion of young birds "g=0.2", a growth rate of "r=0", and a maximum age of 9 years "amax=9".
For this linear survival probability "S(a)", the yearly (relative) mortality "rM(a)" increases with age. This would be plausible when all or the majority of birds would reach the biological maximum age "amax". However, wildlife birds seldom reach this maximum age "amax" because they usually die before (e.g. predation, accidents, diseases). Therefore, the exponential survival probability might be a more realistic model consistant with a constant mortality rate independent of the age (except for the higher first year mortality).